ABSTRACT

Artificial neural networks have been used for a number of years to process holography-generated characteristic patterns of 
vibrating structures. This technology depends critically on the selection and the conditioning of the training sets. A scaling 
operation called folding is discussed for conditioning training sets optimally for training feed-forward neural networks to 
process characteristic fringe patterns. Folding allows feed-forward nets to be trained easily to detect damage-induced 
vibration-displacement-distribution changes as small as 10 nanometers. A specific application to aerospace of neural-net 
processing of characteristic patterns is presented to motivate the conditioning and optimization effort. 


1. INTRODUCTION 

Neural-net processing of characteristic patterns of vibrating structures is used routinely for non-destructive evaluation. 1,2
The characteristic patterns (Figure 1) are generated using electronic time-average holography of the vibrating structure and 
are sub-sampled before processing. The lower resolution patterns containing a few hundred to a few thousand pixels 
(Figure 2) are then presented to an experimentally trained neural network. The neural network is trained to detect small 
changes in the characteristic patterns resulting, for example, from structural changes or damage. 

The neural-net, electronic-holography combination used at NASA Glenn Research Center to detect structural changes and
damage has evolved through several stages. The current combination is experimentally trained; is immune to the laser
speckle effect; can be used with fiber scopes; operates at 30 frames per second; and uses feed-forward artificial neural
networks (multi-layer perceptrons) very efficiently. The feed-forward architecture (Figure 3) is probably the most familiar
architecture for so-called artificial neural networks and has much to recommend it. An artificial neural network is defined 
to be any processing system that is programmed with a training set of exemplars. As a specific example, the feed-forward 
net remains compact in software as the size of that training set increases; has good noise immunity; and can be trained with 
straightforward algorithms of the back-propagation genre. The feed-forward net can process fairly large input images, if the 
number of hidden-layer nodes is not too large, and is well suited to processing speckled characteristic patterns at 30 frames
per second when those patterns contain a few hundred to a few thousand pixels. 

Feed-forward artificial neural networks, or multi-layer perceptrons, do have a reputation at times for being unable to learn
training sets that are deemed otherwise to be learnable. Research continues to be conducted in measuring the extent to
which a training set is learnable. 3,4 Nevertheless, it has been known for some time that the performance of feed-forward
artificial neural networks can be enhanced greatly by conditioning the inputs. The proprietary functional-link net
transforms inputs mathematically before subjecting them to the back-propagation algorithm. 5 Another practice that
improves learning is to scale the individual pixels of the training exemplars to cover the entire input range of the
feed-forward net. So-called min-max tables of the minimum and maximum pixel values are used for scaling. Learning of
characteristic patterns does improve with positional scaling, but the nets are susceptible to over-training.

There is a better way called folding to condition characteristic patterns. In folding, input pixels are scaled according to their 
locations in an intensity range rather than their positions in a characteristic pattern. Folding greatly increases the sensitivity 
of a feed-forward net for detecting changes in a characteristic pattern. This paper will discuss the learning-performance 
improvement achieved with folding. 

We begin by summarizing the experimentally trained non-destructive-evaluation (NDE) method that is the motivation for
improving the learning performance of the feed-forward nets. A discussion of the NDE method is used to introduce the
appropriate training-record format consisting of a speckled characteristic pattern as input and a so-called degradable
classification index (DCI) as output. Next, a structural model is used to create training sets containing exemplars with
accurately known structural changes or damage. The training sets are then prepared in three ways: left unconditioned,
scaled with min-max tables, and folded. The resulting training sets are then used to train feed-forward nets. Nets that
successfully learn the training sets are tested using speckle patterns that differ from the training patterns. The relative
performances of the different kinds of conditioning are compared.


Figure 1: Characteristic pattern for a cracked blade.

Figure 3: Feed-forward architecture.


NASA/TM-2001-210979


2. TRAINING WITH EXPERIMENTALLY DERIVED RECORDS 


2.1 Format of training records 

A very effective experimental method has been developed for training neural networks to detect structural damage. 1,2 The
method can be described algorithmically, and has been used to inspect an International Space Station (ISS) instrumentation 
cold plate for pressure-cycle induced damage. The objective of this paper is to discuss how to condition the training records 
to maximize the damage-detection sensitivity of this method. 

A training record consists of an input and an output. The input is a characteristic pattern recorded of a vibrating structure 
excited to vibrate at very low amplitude. The structure is excited to vibrate in a normal or resonant mode, and the 
characteristic pattern shows the mode shape. A characteristic pattern is generated using electronic or television holography. 
Television holography is available commercially in more than one form, and is discussed extensively in the literature. Most 
applications use multiple holograms to average out electronic or speckle-effect noise, or to extract quantitative vibration- 
displacement data. The neural-net application uses only 2 holograms and consequently maintains a high sampling rate. The 
two holograms differ only by a π relative phase shift of the reference beam. The absolute value of the difference between
the 2 holograms provides the visualization such as shown in fig. 1. 
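
The two-hologram difference can be illustrated for a single pixel. The following is a minimal sketch, not the laboratory software: it assumes that each hologram records the intensity of the object beam plus a phase-shifted reference beam, and it treats both beams as static complex amplitudes (the time averaging over the vibration cycle is omitted). The function names are hypothetical.

```python
import cmath
import math


def hologram_intensity(obj: complex, ref: complex, ref_phase: float) -> float:
    """Intensity recorded when the object beam interferes with a
    reference beam shifted by ref_phase (assumed model of one hologram)."""
    return abs(obj + ref * cmath.exp(1j * ref_phase)) ** 2


def two_hologram_pixel(obj: complex, ref: complex) -> float:
    """Absolute difference of two holograms whose reference phases differ by pi.

    The self-interference terms |obj|^2 and |ref|^2 cancel, leaving
    |4 * Re(obj * conj(ref))|, the cross-interference term.
    """
    first = hologram_intensity(obj, ref, 0.0)
    second = hologram_intensity(obj, ref, math.pi)
    return abs(first - second)


if __name__ == "__main__":
    # Hypothetical amplitudes chosen only for illustration.
    print(two_hologram_pixel(0.3 + 0.4j, 1.0 + 0.0j))
```

Only two frames are needed per visualization in this scheme, which is consistent with the high sampling rate noted above.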

Training inputs actually consist of sub-sampled versions of the full TV resolution patterns such as shown in fig. 1.
Sub-sampling begins by dividing the structure into large pixels. A few hundred to a few thousand large pixels are selected.
Sampling is then accomplished without averaging. That is, one full-resolution pixel is recorded for each large pixel. This
approach permits rapid recording of a large number of independent speckle patterns for each characteristic pattern, since
there are many full-resolution pixels within a large pixel. It has been shown 6 that neural nets become insensitive to the
details of the speckle pattern, if uncorrelated speckle patterns, equal in number to 10 percent of the number of large pixels, 
are used to train the net to recognize each characteristic pattern. Figure 2 shows a sub-sampled characteristic pattern of the 
blade in fig. 1, but at a smaller excitation amplitude. 
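
The sub-sampling step can be sketched as follows. This is an illustration under stated assumptions rather than the recording software itself: the image is a plain list of rows, the large pixels are square blocks, and the full-resolution pixel within each block is chosen at random (the text specifies only that one pixel is recorded per large pixel, without averaging). The helper names are hypothetical.

```python
import random


def subsample(image, block, rng):
    """Return one full-resolution pixel from each block x block large pixel.

    No averaging is done. Choosing a different pixel inside each block on
    every call yields a different speckle pattern of the same mode.
    """
    rows, cols = len(image), len(image[0])
    result = []
    for r0 in range(0, rows - block + 1, block):
        out_row = []
        for c0 in range(0, cols - block + 1, block):
            r = r0 + rng.randrange(block)  # random row inside the block
            c = c0 + rng.randrange(block)  # random column inside the block
            out_row.append(image[r][c])
        result.append(out_row)
    return result


def speckle_patterns_needed(n_large_pixels, fraction=0.10):
    """Number of uncorrelated speckle patterns per characteristic pattern,
    using the 10 percent rule quoted in the text."""
    return max(1, round(fraction * n_large_pixels))


if __name__ == "__main__":
    rng = random.Random(0)
    image = [[10 * r + c for c in range(6)] for r in range(4)]
    print(subsample(image, 2, rng))
    print(speckle_patterns_needed(2000))
```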

At low frequencies, characteristic patterns or mode shapes are especially sensitive to boundary conditions and are sensitive 
to internal structural details and changes. The objective is to use the neural net to detect and flag such changes. The neural 
net uses an output to indicate the extent to which a mode has changed from a training mode as a result of structural changes 
or damage. The output used is a so-called degradable classification index (DCI). The DCI degrades or changes gradually 
as the mode shape changes from the original training shape. The DCI is encoded with 2 or 3 neural-net output nodes. The
simplest example would consist of the output pair (1, 0) for a mode that was completely identical with the training mode 
and (0, 1) for a mode shape that differed completely from the training mode. The output would change gradually between 
the pairs as the mode shape changed gradually. This simple approach actually works very well, if the following algorithmic 
training procedure is employed. 

2.2 Training procedure 

The training and non-destructive-evaluation procedures are discussed more completely in the references. 1,2 The following is
a brief, but usable, description. The description is reduced to an eight-step procedure.

First, select about 5 vibration modes that cover the region of interest. Figure 4 shows vibration modes covering a region of 
suspicious structural integrity between 4 bolt holes in an International Space Station (ISS) cold plate.

Second, select 3 of these modes together with the zero-amplitude condition, and collect enough independent speckle 
patterns for each. Figure 5 shows one speckle-pattern sample for the zero-amplitude condition. There were about 2000
large pixels for this test; hence, about 200 independent speckle patterns per mode were required to train the net for
speckle-pattern immunity.

Figure 4: Modes from cold-plate.


Figure 5: Sub-sampled region. 


Figure 6: Mode monitored for damage. 


Third, select an appropriate feed-forward neural-net architecture. The method for selecting an optimum architecture is 
discussed in a reference. 6 A 3-layer architecture is selected most often, although occasionally a 4-layer net will perform 
best. The input layer requires one node for each pixel. Hence, about 2000 input nodes would be required for the cold-plate 
example. The second layer or hidden layer contains very few nodes. It’s generally desirable to use as few nodes as possible 
to minimize the chance of over training. Typically, 10 nodes or fewer are required. The output layer encodes the DCI and 
contains perhaps 2 nodes. The nodes in the hidden and output layers form a linear combination of their inputs and
transform this sum non-linearly, using for example a sigmoid function, to generate an output. It may be necessary to
rescale outputs such as (1, 0) to (0.8, 0.2) for the sigmoid function.

Fourth, select the mode to be excited at low amplitude and monitored during a test. This mode is arbitrarily assigned 
the DCI (1, 0) or (0.8, 0.2). The other two modes and the zero-amplitude condition are then assigned the DCI (0, 1) or 
(0.2, 0.8). Figure 6 shows the sub-sampled mode that was actually monitored for the ISS cold-plate test. Only part of the
mode is visible between the bolt holes. 

Fifth, train the neural net to an RMS error of 0.01. The RMS error is computed from the squares of the differences between 
the training outputs and the measured outputs for all training records. 

Sixth, define or select a value of the output node that constitutes a significant change from the training input. 
Environmental factors and alignment will change the DCI somewhat and randomly. A change of the large index from 0.8 to 
0.7 may be considered to be significant. 

Seventh, test the sensitivity of the trained net to structural changes. Typically, small point loads are applied at strategic 
locations on the structure, and the response of the net is noted. The design of this step is a question of structural testing 
rather than optical metrology. Currently, little more than the judgments of the test personnel are used to evaluate the 
outcome of this test. 

Eighth, select a new set of vibration modes, and repeat steps one through seven, if the response of the net is not sufficiently 
sensitive. 

The eight-step procedure has been automated with software, and a complete cycle takes about 20 minutes. 
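
Steps four through six can be sketched in a few lines. This is a minimal illustration, not the automated software mentioned above. The normalization of the RMS error (a mean over all records and output nodes) and the labels used for the modes are assumptions; the 0.7 threshold is the example value from step six.

```python
import math

MONITORED_DCI = (0.8, 0.2)  # mode excited and monitored during a test
OTHER_DCI = (0.2, 0.8)      # other modes and the zero-amplitude condition


def dci_target(label, monitored_label):
    """Step four: assign the training DCI for one record."""
    return MONITORED_DCI if label == monitored_label else OTHER_DCI


def rms_error(targets, outputs):
    """Step five: RMS of the differences between training and measured
    outputs, averaged over all records and nodes (assumed normalization)."""
    squares = [
        (t - o) ** 2
        for target, output in zip(targets, outputs)
        for t, o in zip(target, output)
    ]
    return math.sqrt(sum(squares) / len(squares))


def is_significant_change(output, threshold=0.7):
    """Step six: flag a change when the large index falls to the threshold."""
    return output[0] <= threshold


if __name__ == "__main__":
    labels = ["mode_a", "mode_b", "mode_c", "zero"]  # hypothetical labels
    targets = [dci_target(label, "mode_b") for label in labels]
    print(targets)
    print(is_significant_change((0.69, 0.31)))
```

Training would continue until `rms_error` over the training records reaches 0.01, after which the threshold test is applied to outputs measured during an inspection.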

The procedure suggests at least 3 R&D questions. How do you use, quantify, or calibrate the eight-step procedure for 
structural NDE? How do you select a training set for optimum detection of structural changes? How do you maximize the 
sensitivity of a feed-forward net for learning the training set and detecting structural changes? 

This paper is concerned only with maximizing sensitivity, but structural finite element models can be used in part to 
formulate answers to all three questions. We shall use finite-element-model-generated training sets to investigate the
conditioning of inputs to optimize the training of feed-forward nets. These training sets are described in the next section.

3. MODEL-GENERATED TRAINING SETS FOR CHECKING CONDITIONING TECHNIQUES


Structural finite-element models have been combined with a model of the laser speckle effect and a model of electronic 
holography to generate characteristic patterns, and these patterns have been used to train artificial neural networks. The 
practice has had limited success. Neural nets trained with a cantilever model were able to detect damage in a physical 
cantilever. 7 But neural nets trained with characteristic patterns generated from deterministic finite-element models of 
twisted fan blades have been unable to recognize or detect damage in the physical fan blades. 8,9 The calculated characteristic 
patterns are too sensitive to boundary conditions and to sample-particular structural properties to represent real structures 
accurately enough for our NDE method. 

Nevertheless, this paper assumes, supported by the success of the cantilever example, that model-generated changes in 
patterns are sufficiently realistic to be used to test the effectiveness of input conditioning on neural-net training. A model is 
used to generate vibration displacement distributions for undamaged and cracked twisted blades. The size of the crack 
effect is varied to check sensitivity. The displacement distributions are combined with models of electronic holography and 
the laser speckle effect to generate characteristic patterns. These characteristic patterns can then be transformed in various 
ways to measure the effects of input conditioning on neural-net training and sensitivity. 

The model is discussed in the references 8,9, and the features needed to understand the training sets are summarized briefly
here. Both cantilever and twisted-blade models have been used to investigate the effects of conditioning, but this study will
be confined to a twisted-blade model. The blade and finite-element node pattern are shown in fig. 7. Figure 1 shows a
characteristic pattern and fig. 2 shows a sub-sampled pattern from the twisted blade. The blade geometry is of constant
cross-section and has a twist that varies linearly from 0 degrees at the root to 30 degrees at the tip. Blade dimensions are
chord, 8.72 cm (3.433 in); maximum thickness to chord ratio, 0.037; and span, 15.24 cm (6.0 in). The finite element
models have a 20x42 mesh of quadrilateral elements along the mid-thickness of the airfoil section. The finite element model
(MSC/NASTRAN Solution 103) tabulates relative vector displacements at 21x43 finite-element nodes. Only the lowest
frequency mode predicted at 199 Hz was used for this study. The blade material for this prediction was 6061-T6 Aluminum
with a Young's Modulus of 66.19 GPa (9.6x10^6 psi), a Poisson's Ratio of 0.33, and a Mass Density of 2712.832 kg/m^3
(2.536x10^-4 lbf sec^2/in^4).



Figure 7: Blade and finite-element node pattern used for model.


Two finite element models were generated, one with a simulated crack and the other without. The crack is located at the 
root and extends from 87% to 100% of chord. The blades were structurally modeled as cantilevers by constraining the root 
nodes in all six degrees of freedom, except in the simulated crack region. The crack was simulated by releasing the 
constraints for all degrees of freedom at the nodes in the crack’s region. 

Another feature called a crack-effect amplification factor f was added only to check optical sensitivity. The change in 
displacement distribution between the cracked and undamaged blades is multiplied by this factor. When f=1, the model is
used as is. When f>1, the optical effect of the crack is greater than predicted by the finite element model. When f<1, the
optical effect of the crack is less than predicted by the finite element model.
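
The amplification factor amounts to a linear blend of the two displacement distributions. A minimal sketch follows, assuming the distributions are tabulated at the same points and stored as flat lists; the function name and the sample values are hypothetical.

```python
def amplify_crack_effect(undamaged, cracked, f):
    """Scale the crack-induced change in displacement by the factor f.

    f = 1 reproduces the cracked model, f = 0 reproduces the undamaged
    model, and 0 < f < 1 gives a crack effect smaller than predicted.
    """
    return [u + f * (c - u) for u, c in zip(undamaged, cracked)]


if __name__ == "__main__":
    undamaged = [0.0, 1.0, 2.0]  # illustrative displacements, in waves
    cracked = [0.0, 1.5, 3.0]
    print(amplify_crack_effect(undamaged, cracked, 0.1))
```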

The finite element model must be combined with 2 optical effects. First, the laser speckle effect must be modeled. The 
simplest model was chosen where the real and imaginary parts of the object-beam amplitude were normally distributed.
Random number generators assured uncorrelated speckle patterns. Second, a sensitivity vector K (equation 1) must be
included for the holography process. The sensitivity vector can be unfavorable for parts of a highly twisted blade, but has 
little effect on the blade used for this study. 

The model was used to generate one pixel for each point at which the displacement was tabulated. A complication evident 
in fig. 7 is that the points are non-uniformly placed. This effect is specifically handled by our software and measurements. 

A final comment is that calculated holograms of both the whole blade and a zoomed region have been used for past tests. 
The zoomed displacement distributions were often required during previous work to generate adequate sensitivity, and were 
generated by cubic interpolation of the finite-element model. This study is confined to the whole blade. The next section 
discusses the transformation of input characteristic patterns to improve neural-net sensitivity. 

4. TRANSFORMATIONS OF CHARACTERISTIC PATTERNS 

The raw input data actually is a signed, speckled pattern, or cross-interference term, whose individual pixels satisfy the 
expression 


$$\text{pixel value} = A \cos(\theta)\, J_0(2\pi\, \mathbf{K} \cdot \boldsymbol{\delta}) \qquad (1)$$

where $A$ is a positive random quantity; $\theta$ is a random variable uniformly distributed from 0 to $2\pi$; $J_0$ is the Bessel function of
the first kind and zero order; $\mathbf{K}$ is the sensitivity vector; and $\boldsymbol{\delta}$ is the vibration displacement amplitude measured in
wavelengths of light.
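
A single pixel of equation (1) can be simulated with the speckle model described in Section 3. The sketch below is an illustration under stated assumptions: the random amplitude and phase are taken from a complex number whose real and imaginary parts are normally distributed, the dot product of the sensitivity vector and the displacement is passed in as a single scalar in waves, and the Bessel function is evaluated from its integral representation to avoid external libraries. The function names and the `sigma` parameter are hypothetical.

```python
import math
import random


def bessel_j0(x):
    """J0(x) from (1/pi) * integral over [0, pi] of cos(x sin t) dt.

    The midpoint rule is used; the integrand is periodic in t, so the
    rule converges quickly once the point count exceeds |x|.
    """
    n = int(2 * abs(x)) + 64
    h = math.pi / n
    total = sum(math.cos(x * math.sin((k + 0.5) * h)) for k in range(n))
    return total / n


def speckled_pixel(k_dot_delta, rng, sigma=1.0):
    """Signed pixel value A cos(theta) J0(2 pi K.delta) for one pixel."""
    re = rng.gauss(0.0, sigma)       # real part of object-beam amplitude
    im = rng.gauss(0.0, sigma)       # imaginary part
    amplitude = math.hypot(re, im)   # positive random quantity A
    theta = math.atan2(im, re)       # uniformly distributed phase
    return amplitude * math.cos(theta) * bessel_j0(2.0 * math.pi * k_dot_delta)


if __name__ == "__main__":
    rng = random.Random(1)
    for waves in (0.0, 0.25, 1.0):
        print(waves, speckled_pixel(waves, rng))
```

A dark fringe occurs wherever the Bessel function vanishes, regardless of the speckle draw, which is why the fringe positions carry the mode-shape information.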

Many transformations of this quantity are possible, but the practice for a long time was to take the absolute value. Then 
pixel values could be arranged from 0 to 255 for visualization. The same absolute values were used to train the neural 
networks as well. 

It was conjectured, of course, that taking the absolute value might eliminate information whose loss might degrade the 
learning and sensitivity of the neural networks. But the surprising finding was that feed-forward nets trained with the 
signed quantity were harder to train than nets trained with the absolute value. A hyperbolic tangent function is used rather 
than a sigmoid function for signed inputs. Figure 8 shows that the absolute-value transformation is a folding operation 
about zero intensity. In relative terms, intensities in the range [-1, 0) are transformed into the range [1, 0), and intensities in
the range [0, 1] are transformed identically to [0, 1]. It was recognized that this symmetrical process could be continued.
For example, figure 9 shows a transformation of [-1.0, -0.5] into the range [1.0, 0], a transformation of
[-0.5, 0] into the range [0, 1.0], a transformation of [0, 0.5] into the range [1.0, 0], and a transformation of [0.5, 1.0] into the
range [0, 1.0]. It was discovered that feed-forward nets trained with this folded data learned more easily than nets trained 
with the absolute value, and that learning can improve as the number of folds increases. 





Figure 8: 1 fold or absolute value.

Figure 9: 3 folds with normal contrast.

Figure 10: 3 folds and reversed contrast.

Folding can be made non-symmetrical and non-uniform, but this paper will cover the symmetrical and uniform case only. 
The symmetrical case requires that the number of folds N be odd. Here N=0 corresponds to unfolded, signed 
characteristic patterns and N=1 corresponds to the absolute value. Figure 9 has N=3. Folding can also be accompanied
by a contrast reversal as in figure 10. In general, we shall define transformations in which dark fringes remain dark as
having normal contrast. Normal contrast requires that zero intensities be transformed to zero. 
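
Symmetrical, uniform folding can be written as a triangle-wave mapping of the relative intensity. The following is a minimal sketch under stated assumptions, not the software used for the study: the input is a relative intensity in [-1, 1], the range is divided into N+1 equal segments, and normal contrast is taken to mean that zero intensity maps to zero, as defined above. The function name and the `reverse_contrast` flag are hypothetical.

```python
def fold(x, n_folds, reverse_contrast=False):
    """Fold a relative intensity x in [-1, 1] into [0, 1].

    n_folds = 0 returns the signed value unchanged, n_folds = 1 is the
    absolute value, and larger odd values add fold points placed
    symmetrically about zero.
    """
    if n_folds == 0:
        return x
    if n_folds < 0 or n_folds % 2 == 0:
        raise ValueError("symmetrical folding needs an odd number of folds")
    width = 2.0 / (n_folds + 1)      # width of one segment of the input range
    u = (abs(x) / width) % 2.0       # position within one rise-and-fall period
    y = 1.0 - abs(u - 1.0)           # triangle wave: 0 at u = 0, 1 at u = 1
    return 1.0 - y if reverse_contrast else y


if __name__ == "__main__":
    for value in (-1.0, -0.5, -0.25, 0.0, 0.25, 0.5, 1.0):
        print(value, fold(value, 3), fold(value, 3, reverse_contrast=True))
```

With three folds, each quarter of the input range is stretched over the full output range, so a small intensity change produces a four-times-larger change at the net input. In this sketch, the interval mapping quoted above for three folds (for example, [-0.5, 0] into [0, 1.0]) is reproduced by `reverse_contrast=True`.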

Folding is an intensity dependent transformation. For comparison, we shall use another transformation called a min-max 
table. For a min-max table, all the intensities in the training set at each point in the characteristic pattern are tabulated. The 
difference between the minimum and maximum intensities at each point is then used to scale the data into the full input 
range of the neural network. That range typically is [0.2, 0.8] for a feed-forward net that uses a sigmoid transfer function. 
The min-max table, unlike folding, will scale even a dark fringe into the full input range of the neural network. 
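
For comparison with folding, min-max scaling can be sketched as follows. This is an illustration, assuming that each pattern is a flat list of pixel values and that a pixel with identical minimum and maximum is mapped to the middle of the input range (the text does not say how that case is handled). The function names are hypothetical.

```python
def build_min_max_table(training_patterns):
    """Tabulate (minimum, maximum) intensity at each pixel position
    over all patterns in the training set."""
    columns = zip(*training_patterns)
    return [(min(column), max(column)) for column in columns]


def scale_with_table(pattern, table, lo=0.2, hi=0.8):
    """Scale each pixel into [lo, hi] using its own min-max entry."""
    scaled = []
    for value, (vmin, vmax) in zip(pattern, table):
        if vmax == vmin:
            scaled.append(0.5 * (lo + hi))  # assumed handling of a flat pixel
        else:
            scaled.append(lo + (hi - lo) * (value - vmin) / (vmax - vmin))
    return scaled


if __name__ == "__main__":
    training = [[0.0, 10.0], [2.0, 30.0], [1.0, 20.0]]  # illustrative values
    table = build_min_max_table(training)
    print(table)
    print(scale_with_table([1.0, 30.0], table))
```

Because the scaling depends on pixel position rather than intensity, a pixel on a dark fringe is stretched over the full input range just like any other pixel.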

The next section presents results that show the performance improvement afforded by folding. 

5. RESULTS USING TRANSFORMED TRAINING RECORDS 

Training sets were generated for maximum vibration amplitudes of 1.0 wave, 5.0 waves and 64.0 waves, and for crack- 
effect amplification factors f=1.0 and f=0.1. Recall that f=1.0 refers to the model-generated cracks, and f=0.1 refers to 
crack effects 10 times smaller. One wave equals 1 wavelength of light. Training and test sets were generated for 0, 1, 3, 5, 7,
and 9 folds, where N=0 represents the signed, unfolded data. The training and test sets have uncorrelated speckle patterns.
The N=0 case was also used to train a net with min-max scaling for comparison. 

The same neural-net architecture was used for all tests. The feed-forward neural network had an input layer containing 903 
nodes, a hidden layer containing 5 nodes and an output layer containing 3 nodes to encode the DCI. One of the 3 output 
nodes is intended as a no-decision indicator, but was not used for this study, where it was clamped to 0.2. The nets were 
always trained with 11,000 back-propagation iterations. After training, RMS errors were measured for both the training and
test sets. The identification error rate was also noted. An identification error was declared, if the maximum node indicated 
an output below 0.6. Note that the training value of a maximum node is 0.8. The DCI triple (0.8, 0.2, 0.2) was used as the 
training value for the undamaged case, and the DCI triple (0.2, 0.2, 0.8) was used as the training value for the cracked case. 
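
The architecture and the identification-error rule can be sketched as follows. This is a minimal illustration, not the trained net: the weights are random placeholders, the clamping of the no-decision node is not modeled, and "maximum node" is interpreted here as the node whose training value is 0.8, which is an assumption. All function names are hypothetical.

```python
import math
import random

UNDAMAGED = (0.8, 0.2, 0.2)  # training DCI for the undamaged case
CRACKED = (0.2, 0.2, 0.8)    # training DCI for the cracked case


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def layer(inputs, weights, biases):
    """One fully connected layer followed by a sigmoid."""
    return [
        sigmoid(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weights, biases)
    ]


def forward(pixels, hidden_w, hidden_b, out_w, out_b):
    """Forward pass through the input, hidden, and output layers."""
    return layer(layer(pixels, hidden_w, hidden_b), out_w, out_b)


def is_identification_error(output, target, floor=0.6):
    """Error if the node trained to 0.8 reads below the floor of 0.6."""
    expected = max(range(len(target)), key=lambda i: target[i])
    return output[expected] < floor


def error_rate_percent(outputs, targets):
    errors = sum(is_identification_error(o, t) for o, t in zip(outputs, targets))
    return 100.0 * errors / len(outputs)


if __name__ == "__main__":
    rng = random.Random(0)
    n_in, n_hidden, n_out = 903, 5, 3
    hidden_w = [[rng.uniform(-0.01, 0.01) for _ in range(n_in)] for _ in range(n_hidden)]
    out_w = [[rng.uniform(-0.01, 0.01) for _ in range(n_hidden)] for _ in range(n_out)]
    pixels = [rng.random() for _ in range(n_in)]
    print(forward(pixels, hidden_w, [0.0] * n_hidden, out_w, [0.0] * n_out))
```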

Figure 11 clearly shows the effectiveness of folding in improving the performance of a feed-forward neural network. Figure
11 was generated for an amplitude of 5.0 waves and a crack-effect amplification factor f=0.1. The net does not train at all
until there are 7 folds, and the 9-fold case performs better than the 7-fold case.

Figure 11: Performance of folding for an excitation amplitude of 5.0 waves and a crack-effect amplification f=0.1.


The same result pertains to other combinations of crack-effect amplification and vibration amplitude. Figure 12 shows the 
result for an amplitude of 1.0 wave and a crack-effect amplification factor f=1.0. Nine folds are required here for a 0% 
error rate; however, the error rate is down to 5% at 3 folds.





[Plot panels: TRAINING ERROR, TEST ERROR, and PERCENT ERROR RATE, each versus FOLDS]

Figure 12: Performance of folding for an excitation amplitude of 1.0 wave and a crack-effect amplification f=1.0. 

In general, the performance of the feed-forward net improves for large vibration amplitudes. Figure 13 shows the result for 
an amplitude of 64.0 waves and a crack-effect amplification factor f=1.0. All folded examples perform adequately at the 
higher amplitude. 

Figure 13: Performance of folding for an excitation amplitude of 64.0 waves and a crack-effect amplification f=1.0.


In general, the unfolded case (N=0) does not perform well or even adequately. Furthermore, min-max table scaling does 
not help. Min-max scaling of the 64.0 wave, f=1.0 case yields a training error of 0.036, a test error of 0.9916, and an error
rate of 80 percent. 


6. DISCUSSION OF RESULTS 

Figure 11 clearly shows not only that folding improves the performance of the feed-forward neural network, but also that
folding is absolutely essential for training in some cases. Neither the signed, unfolded characteristic pattern (N=0) nor the
absolute value (N=1), used most often for training the nets in the past, was able to train the net for fig. 11.

A natural question arises. Just how sensitive is the feed-forward-net, folding combination for detecting structural changes? 
This paper can provide an answer only for the model-generated characteristic patterns and model-calculated damage 
reported herein. That damage refers specifically to a crack at 0 span. Nevertheless, the results are reasonably impressive. A 
net was trained for f=0.01 and a vibration amplitude of 64.0 waves. Nine folds (N=9) were employed. The net was allowed 
to develop a minimum RMS training error of 0.01. The test RMS error was fairly high (0.2504), but the actual 
identification error rate was only 20 percent. The large test error was contributed mainly by values of the large node greater 
than 0.8, resulting still in largely correct identifications. Hence, this net, training-set combination probably was performing 
at the limit of detection. A symptom of impending net failure is an increasing test error. Complete failure occurs when the
training error increases to a large value. The damage detected would have produced a maximum change in the displacement
distribution of less than 10 nanometers in an experimental test. 

A final comment is that holography actually detects the dot product of the sensitivity vector and the displacement vector as 
shown in equation (1). This paper used a distribution of the sensitivity vector characteristic of a typical setup in a NASA 
Glenn holography laboratory used to inspect the physical fan blade whose model was used for this study. The minimum 
dot-product effect was still about 95 percent of the maximum possible effect, mainly because most of the motion was in the 
direction of the optical axis. 


7. CONCLUDING REMARKS 

The conclusion is that folding greatly improves the performance of feed-forward nets for learning speckled characteristic- 
pattern training records and is absolutely essential for learning to differentiate training records corresponding to small 
structural changes. Vibration-displacement-distribution changes as small as 10 nanometers can be detected at the 
maximum-difference point. 

Future work pertains mainly to non-destructive evaluation and structural analysis practices. There is a need to quantify and 
calibrate the inspection method discussed in Section 2 and to render it consistent with NASA standards. It would also be 
interesting to measure the susceptibility-to-learning of the characteristic-pattern training records independently of the
neural-net architecture. 